{ "cells": [ { "cell_type": "markdown", "id": "c28a32e9-e5e7-4f17-8384-bd9e752b340e", "metadata": {}, "source": [ "# DMP and DMR" ] }, { "cell_type": "code", "execution_count": null, "id": "58882ee2-6b75-4196-8bbc-b64ae36eee04", "metadata": {}, "outputs": [], "source": [ "from pylluminator.samples import Samples\n", "from pylluminator.visualizations import manhattan_plot_dmr, plot_dmp_heatmap, visualize_gene\n", "from pylluminator.dm import get_dmp, get_dmr, get_top_dmrs\n", "from pylluminator.utils import save_object, load_object, set_logger\n", "\n", "# Set the logging verbosity: one of DEBUG, INFO, WARNING, ERROR\n", "set_logger('WARNING')" ] }, { "cell_type": "markdown", "id": "71bcd367-d86f-4f20-b19a-7f8d75e9d39f", "metadata": {}, "source": [ "## Load pylluminator Samples\n", "\n", "We assume that you have already processed the .idat files according to your preferences and saved them. If not, please refer to notebook `1 - Read data and get beta values` before going any further." ] }, { "cell_type": "code", "execution_count": null, "id": "ae0bd9d9-752f-40d7-b93c-0f41f709ba3d", "metadata": {}, "outputs": [], "source": [ "my_samples = Samples.load('preprocessed_samples')" ] }, { "cell_type": "markdown", "id": "c511cfe4-f639-4925-9e15-916633d95b1d", "metadata": {}, "source": [ "Here, we want to filter out the probes on the X and Y chromosomes." 
] }, { "cell_type": "code", "execution_count": null, "id": "9000c4d1-6154-4908-b851-8a7f0f91e0a5", "metadata": {}, "outputs": [], "source": [ "my_samples.mask_xy_probes()" ] }, { "cell_type": "markdown", "id": "a19fb496-6ff3-403a-b3c4-9be90d7af6e7", "metadata": {}, "source": [ "To speed up the demo, we will only calculate DMPs and DMRs on the first 10% of the probes." ] }, { "cell_type": "code", "execution_count": null, "id": "0c2b4526-52a5-45b8-af6e-f1293b66eafb", "metadata": {}, "outputs": [], "source": [ "ten_pct_probes = int(0.1 * my_samples.nb_probes)\n", "probe_ids = my_samples.probe_ids[:ten_pct_probes]\n", "print(f'Selected the first {ten_pct_probes:,} probes')" ] }, { "cell_type": "markdown", "id": "433f4b97-cb16-47f9-b4af-215d2d7800dc", "metadata": {}, "source": [ "## Differentially Methylated Probes\n", "\n", "The second parameter of `get_dmp()` is an R-like formula that describes the statistical model and is used to build the design matrix, e.g. `'~ age + sex'`. The variable names in the formula must match column names of the sample sheet (`my_samples.sample_sheet`, displayed below).\n", "\n", "More info on design matrices and formulas:\n", "- https://www.statsmodels.org/devel/gettingstarted.html\n", "- https://patsy.readthedocs.io/en/latest/overview.html\n" ] }, { "cell_type": "code", "execution_count": null, "id": "c11c9f7d-a965-4780-8063-1b21b5150b62", "metadata": {}, "outputs": [], "source": [ "my_samples.sample_sheet" ] }, { "cell_type": "code", "execution_count": null, "id": "8b468f1a-92ae-45bb-b0bc-aae79b54bca1", "metadata": {}, "outputs": [], "source": [ "dmps, contrasts = get_dmp(my_samples, '~ sample_type', probe_ids=probe_ids)\n", "contrasts" ] }, { "cell_type": "markdown", "id": "43e4c278-f7f6-4d8e-a27e-83fafdc3eb57", "metadata": {}, "source": [ "You can now plot the results for the 25 most variable probes:" ] }, { "cell_type": "code", "execution_count": null, "id": "13381265-f0d1-4b9a-b964-c78d7bd6b21c", "metadata": {}, "outputs": [], "source": [ "plot_dmp_heatmap(dmps, my_samples, contrasts[0], nb_probes=25)" ] }, { 
"cell_type": "markdown", "id": "e48af4ef-e130-462f-b758-8758fa758d73", "metadata": {}, "source": [ "## Differentially Methylated Regions" ] }, { "cell_type": "code", "execution_count": null, "id": "c2bcc0e8-70bc-44bb-9ec0-bb8e5fc25336", "metadata": {}, "outputs": [], "source": [ "dmrs = get_dmr(my_samples, dmps, contrasts, probe_ids=probe_ids)" ] }, { "cell_type": "code", "execution_count": null, "id": "a4db8026-e88e-4436-a095-ed28972de457", "metadata": {}, "outputs": [], "source": [ "manhattan_plot_dmr(dmrs, contrasts[0], annotation=my_samples.annotation)" ] }, { "cell_type": "code", "execution_count": null, "id": "24b861a6-814f-452c-91f1-67a50ba30490", "metadata": {}, "outputs": [], "source": [ "get_top_dmrs(dmrs, my_samples.annotation, contrasts[0])" ] }, { "cell_type": "markdown", "id": "382314fe-5f84-4eae-b449-a2249f45a3e9", "metadata": {}, "source": [ "## Gene visualization\n", "\n", "Let's have a look at a particular gene identified as differentially methylated:" ] }, { "cell_type": "code", "execution_count": null, "id": "775af73e-b5f5-4950-91a3-3f5378aa78a7", "metadata": {}, "outputs": [], "source": [ "visualize_gene(my_samples, 'FOXL2')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 5 }